Efficient Search in Big Data using ReFinder
Author
Abstract
In this paper, we present a context-based information refinding system called ReFinder. It leverages natural human recall characteristics and allows users to refind files and Web pages according to the context in which they were previously accessed. ReFinder refinds information based on a query-by-context model over a context memory snapshot, which links to the accessed content. Context instances in the memory snapshot are organized in a clustered and associated manner, and dynamically evolve in life cycles to mimic the decay and reinforcement phenomena of brain memory. We evaluate the scalability of ReFinder on a large synthetic data set. The experimental results show that degrading context instances in the context memory consistently with those in users' refinding requests yields the best refinding precision and recall. An 8-week user study is also conducted to examine the applicability of ReFinder. Initial findings show that time, place, and activity could serve as useful recall clues. On average, 15.53 seconds are needed to complete a refinding request with ReFinder and 84.42 seconds with other existing methods. Big Data concern large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage, and data collection capacity, Big Data are now rapidly expanding in all science and engineering domains, including the physical, biological, and biomedical sciences. This paper presents a HACE theorem that characterizes the features of the Big Data revolution, and proposes a Big Data processing model from the data mining perspective. This data-driven model involves demand-driven aggregation of information sources, mining and analysis, user interest modeling, and security and privacy considerations. We analyze the challenging issues in the data-driven model and also in the Big Data revolution. Possible further improvements to ReFinder are also discussed at the end of the paper.
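The abstract does not specify ReFinder's data structures or decay function, so the following is only a minimal sketch of the query-by-context idea under stated assumptions: context instances carry the three clues named above (time, place, activity), decay is modeled as an exponential half-life, clues are matched exactly, and a successful refind resets an instance's strength. All class, method, and parameter names here are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ContextInstance:
    """One remembered access: context clues plus the item they link to."""
    time: str              # coarse time clue, e.g. "2015-03" (assumed format)
    place: str
    activity: str
    target: str            # file path or URL that was accessed
    strength: float = 1.0  # hypothetical retention score in (0, 1]


class ContextMemory:
    """Toy context memory with decay and reinforcement (illustrative only)."""

    def __init__(self, half_life_days=30.0):
        self.half_life_days = half_life_days  # assumed decay parameter
        self.instances = []

    def remember(self, instance):
        self.instances.append(instance)

    def decay(self, days):
        # Assumed exponential decay: strength halves every half_life_days.
        factor = 0.5 ** (days / self.half_life_days)
        for inst in self.instances:
            inst.strength *= factor

    def refind(self, **clues):
        """Return targets whose context matches every clue, strongest first."""
        hits = [
            inst for inst in self.instances
            if all(getattr(inst, key) == value for key, value in clues.items())
        ]
        hits.sort(key=lambda inst: inst.strength, reverse=True)
        for inst in hits:
            inst.strength = 1.0  # reinforcement: recalled contexts are refreshed
        return [inst.target for inst in hits]


memory = ContextMemory(half_life_days=30.0)
memory.remember(ContextInstance("2015-03", "office", "meeting", "report.pdf"))
memory.decay(30)  # one half-life passes before the next access
memory.remember(ContextInstance("2015-04", "office", "coding", "notes.txt"))
results = memory.refind(place="office")
```

In this sketch the more recently remembered item ranks first because the older instance has decayed; a real system would also need the clustering, association, and approximate matching that the abstract mentions, which are omitted here.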
Similar resources
OPTIMUM DESIGN OF DOUBLE CURVATURE ARCH DAMS USING A QUICK HYBRID CHARGED SYSTEM SEARCH ALGORITHM
This paper presents an efficient optimization procedure to find the optimal shapes of double curvature arch dams considering fluid–structure interaction subject to earthquake loading. The optimization is carried out using a combination of the magnetic charged system search, big bang-big crunch algorithm and artificial neural network methods. Performing the finite element analysis dur...
EVALUATING EFFICIENCY OF BIG-BANG BIG-CRUNCH ALGORITHM IN BENCHMARK ENGINEERING OPTIMIZATION PROBLEMS
Engineering optimization needs easy-to-use and efficient optimization tools that can be employed for practical purposes. In this context, stochastic search techniques have a good reputation and wide acceptability as powerful tools for solving complex engineering optimization problems. However, increased complexity of some metaheuristic algorithms sometimes makes it difficult for engineers t...
CONSTRAINED BIG BANG-BIG CRUNCH ALGORITHM FOR OPTIMAL SOLUTION OF LARGE SCALE RESERVOIR OPERATION PROBLEM
A constrained version of the Big Bang-Big Crunch algorithm for the efficient solution of optimal reservoir operation problems is proposed in this paper. The Big Bang-Big Crunch (BB-BC) algorithm is a new population-based meta-heuristic algorithm that relies on one of the theories of the evolution of the universe, namely the Big Bang and Big Crunch theory. An improved formulation of the algorithm na...
Survey on Perception of People Regarding Utilization of Computer Science & Information Technology in Manipulation of Big Data, Disease Detection & Drug Discovery
This research explores the manipulation of biomedical big data and disease detection using automated computing mechanisms. Because an efficient and cost-effective way to discover diseases and drugs is important for society, a computer-aided automated system is a must. This paper aims to understand the importance of computer-aided automated systems among the people. The analysis result from collected da...
Verification of unemployment benefits’ claims using Classifier Combination method
Unemployment insurance is one of the most popular insurance types in the modern world. The Social Security Organization is responsible for checking the unemployment benefits of individuals supported by unemployment insurance. Manual evaluation of unemployment claims requires a great deal of time and money. Data mining and machine learning, as two efficient tools for data analysis, can assist ...
Publication date: 2015